TITLE OF THE INVENTION 
IMAGE PROCESSING METHOD AND APPARATUS AND 
STORAGE MEDIUM 



FIELD OF THE INVENTION

The present invention relates to an image
processing method and apparatus, and a storage medium,
for correcting the positional offset of an input image
with respect to a reference image.


BACKGROUND OF THE INVENTION 

In the field of document processing in which a
large quantity of documents is processed collectively,
documents are generally processed by applying, to each
input document image, processing control information
permanently set for the respective type of document,
i.e., information indicating the specific positions of
the document at which character recognition is to be
performed, information indicating the specific areas of
the document from which information is to be extracted,
and the like.

In consideration of physical errors in a read
mechanism and instability of paper documents themselves,
it is almost impossible to read a large quantity of
document images one by one accurately at the same
position by using a scanner. This tendency has
recently become increasingly conspicuous with an
increase in the processing speed of scanners.

When processing is to be performed on the basis
of permanent positional information in this situation
in the above manner, a decrease in the precision of
subsequent processing, e.g., character recognition, due
to a positional offset is inevitable.

Conventionally, to prevent such a problem,
positioning markings are formed on documents themselves
to obtain the reference position of each document, and
various processes are performed on the basis of the
position of a predetermined processing target area
relative to the reference position. Alternatively, the
layout of a document itself is designed to set a large
margin for a positional offset, or a high-resolution
scanner is used.

The conventional methods of preventing a document
positional offset described above impose strict
constraints on document design, and a high-resolution
scanner leads to an increase in cost.
These factors have greatly interfered with efficient
document processing. Another serious problem is that
it is almost impossible to apply these methods to read
processing systems that process different types of
documents, which tend to become mainstream.

The present invention has been made in
consideration of the above problem, and has as its
object to correct the positional offset of an image
without performing any processing for the setting of a
reference position with respect to a document image,
e.g., the setting of markings.



SUMMARY OF THE INVENTION 
In order to achieve the object of the present 
invention, for example, an image processing apparatus 
of the present invention has the following arrangement. 

There is provided an image processing apparatus 
for correcting a positional offset of an input image 
with respect to a reference image, comprising storage 
means for storing information about the reference image, 
including a reference position, area information 
specifying means for obtaining information about a 
plurality of areas included in the input image, target 
position calculating means for calculating a target 
position on the input image on the basis of the 
information obtained by the area information specifying 
means, calculating means for specifying information 
about the reference image in accordance with the input 
image on the basis of information from the storage 
means, and calculating a positional offset between the 
reference position included in the specified 
information and the target position, and correcting 
means for correcting positions of a plurality of areas 
included in the input image by using the offset 
calculated by the calculating means. 



In addition, the target position calculating 
means obtains a leftmost end/uppermost end position of 
a plurality of areas included in the input image and 
sets the position as the target position. 

Furthermore, the target position calculating 
means further comprises removing means for removing an 
unstable area from a plurality of areas included in the 
input image, and calculates a target position for the 
input image by using areas left after area removal 
performed by the removing means. 

Other features and advantages of the present 
invention will be apparent from the following 
description taken in conjunction with the accompanying 
drawings, in which like reference characters designate 
the same or similar parts throughout the figures 
thereof.

BRIEF DESCRIPTION OF THE DRAWINGS 
The accompanying drawings, which are incorporated 
in and constitute a part of the specification, 
illustrate embodiments of the invention and, together 
with the description, serve to explain the principles 
of the invention. 

Fig. 1 is a block diagram showing the schematic 
arrangement of an image processing apparatus according 
to the first embodiment of the present invention; 
Fig. 2 is a flow chart for a case where a
processor 4 processes one document;

Fig. 3 is a view for explaining the step of
calculating a positional offset amount in the
processor 4 and the step of correcting a processing
position;

Fig. 4 is a flow chart showing a procedure for
calculating a document origin;

Fig. 5A is a view for explaining block selection;

Fig. 5B is a view for explaining block selection;
and

Fig. 5C is a view for explaining block selection.


DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 
Preferred embodiments of the present invention
will now be described in detail in accordance with the
accompanying drawings.
[First Embodiment] 

Fig. 1 is a view showing the schematic
arrangement of an image processing apparatus according
to the first embodiment, which performs document
processing to be described later.

Reference numeral 2 denotes an image input means
such as a scanner, camera, or file reading unit which
inputs a document image; 4, a processor for performing
document processing to be described later; 6, a
pointing device such as a keyboard or mouse which
inputs instructions to the processor 4; 8, a disk for
storing reference data for document recognition or
processing control information unique to a document; 10,
a memory which temporarily stores document processing
data of the processor 4 or the document image read by
the image input means 2; 12, an output means
such as a display or printer which outputs a processing
result; and 14, a ROM storing program codes by which
the processor 4 executes various processes.

The operation of the image processing apparatus
in this embodiment having the above arrangement will be
described next. First of all, in accordance with the
instructions input from the pointing device 6, the
document image converted into an electronic form by the
image input means 2 is acquired and bitmapped in the
memory 10. The bitmapped document image is subjected
to area identification in the processor 4. Thereafter,
document recognition, positional offset detection, and
various document processes (character recognition and
the like) are performed for the document image. The
processing result is output through the output means 12
such as a display or printer.

Various control processes executed by the image
processing apparatus of this embodiment, and more
specifically, by the processor 4, will be described with
reference to Figs. 2 and 3.

Fig. 2 is a flow chart for a case where the
processor 4 processes one document. The program codes
conforming to the flow chart of Fig. 2 are stored in
the ROM 14 and are read out and executed by the
processor 4. With this operation, the image processing
apparatus of this embodiment executes each process to
be described later.

In step S200, the processor 4 receives a document 
image from the image input means 2 and transfers it as 
image data to the memory 10. 

In step S202, the processor 4 performs area 
identification of the document image bitmapped in the 
memory 10 in step S200. This operation can be 
implemented by applying the block selection technique 
and the like disclosed in, for example, Japanese Patent 
Laid-open No. 6-068301. In this operation, an area 
(block) having the same attribute on the document is 
extracted in accordance with the input image 
information, and area identification information such 
as an attribute, size, and position is specified. 

In step S204, document identification is 
performed to identify the input document on the basis 
of the area identification information extracted in 
step S202. 

In step S206, processing control information 
(including an original document origin) unique to the 
document identified in step S204 is extracted from a 
database in the disk 8, and transferred to the memory 
10. 



In step S208, an input document origin is
generated from the area identification information 
extracted in step S202. 

In step S210, the processor 4 calculates the 
amount of positional offset (document offset) between 
the input document origin obtained in step S208 and the 
original document origin transferred into the memory 10 
in step S206. 

In step S212, the processor 4 corrects the 
positional information of the target area in the 
processing control information of the original document 
by using the positional offset amount calculated in 
step S210. 

Steps S210 and S212 will be described in detail
later.

In step S214, the processor 4 performs various 
processes such as character recognition on the basis of 
the positional information of the target area of the 
document corrected in step S212. Specific instructions 
for such processes are stored in the processing control 
information. 

In step S216, the output means 12 outputs the
results obtained by the processes performed in step
S214.
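
The flow of steps S202 to S214 can be sketched roughly as follows.
This is only an illustrative outline, not the program stored in the
ROM 14: the names Block, ControlInfo, and process_document, and the
callables passed in for area identification, document identification,
and the per-area process, are all hypothetical. The input document
origin is taken here as the leftmost and uppermost block coordinates,
which is the choice described for the first embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]


@dataclass
class Block:
    """One area found by area identification (hypothetical record)."""
    attribute: str
    x: int
    y: int
    width: int
    height: int


@dataclass
class ControlInfo:
    """Processing control information registered for one document type."""
    origin: Point               # original document origin
    targets: Dict[str, Point]   # process name -> registered position


def process_document(
    image: object,
    identify_areas: Callable[[object], List[Block]],
    identify_document: Callable[[List[Block]], str],
    database: Dict[str, ControlInfo],
    run_process: Callable[[object, str, Point], object],
) -> Dict[str, object]:
    blocks = identify_areas(image)                # step S202
    doc_id = identify_document(blocks)            # step S204
    control = database[doc_id]                    # step S206
    input_origin = (min(b.x for b in blocks),     # step S208
                    min(b.y for b in blocks))
    dx = input_origin[0] - control.origin[0]      # step S210
    dy = input_origin[1] - control.origin[1]
    corrected = {name: (x + dx, y + dy)           # step S212
                 for name, (x, y) in control.targets.items()}
    return {name: run_process(image, name, pos)   # step S214
            for name, pos in corrected.items()}
```

Passing the identification and processing routines in as arguments
keeps the sketch independent of any particular block selection or
character recognition implementation.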

Fig. 3 is a view for explaining the step of
calculating a positional offset amount in the processor
4 in step S210 and the step of performing processing
position correction in step S212.

The left side of Fig. 3 shows the state of an
image when an original document is registered in the
above database. When the image to be registered is
read, area identification is performed for the read
image. In the state indicated by the left side of
Fig. 3, an OCR area and image extraction area are
identified and acquired as area identification
information. An original document origin is then
determined by using this area identification
information. In this embodiment, referring to Fig. 3,
the original document origin is determined to be
(50, 50) by the same processing as in step S208.
This original document origin is registered as
processing control information of the corresponding
document in the above database, together with an OCR
application position (100, 100) in the OCR area in
Fig. 3 and an image extraction position (200, 400) in
the image extraction area. In addition, in the case of
this document, the size of the OCR area, a character
recognition processing instruction, the size of the
image extraction area, and an extraction instruction
are also registered as processing control information
in the database.

The right side of Fig. 3 shows an example of the
state where a document to be processed is input. When
the document to be processed is input, area
identification is performed to identify an OCR area and
image extraction area (step S202), and the input
document is identified (step S204). An input document
origin is then generated by using the area
identification information acquired by area
identification (step S208). When this input document
origin is compared with the original document origin
read out in step S206, a positional offset between the
image obtained when the original document was
registered in the database and the position at which
the input document was read can be detected from the
offset between the original document origin position
shown on the left side of Fig. 3 and the input document
origin position shown on the right side of Fig. 3
(step S210).

In the step of calculating the positional offset
amount (step S210), the processor 4 obtains the
positional offset amount by subtracting the original
document origin from the input document origin obtained
in step S208, as indicated by the lower portion of
Fig. 3. In the processing position correction step
(step S212), the positional offset amount is added to
the OCR application position coordinates and image
extraction position coordinates, thereby obtaining a
more accurate processing application position (the OCR
position (160, 160) and image extraction position
(260, 460)).
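
The arithmetic of steps S210 and S212 can be checked with a short
sketch. The function names are hypothetical, and the input document
origin (110, 110) is an assumption: it is not stated in the text but is
the value implied by the original document origin (50, 50) and the
corrected positions (160, 160) and (260, 460) given above.

```python
from typing import Tuple

Point = Tuple[int, int]


def positional_offset(input_origin: Point, original_origin: Point) -> Point:
    """Step S210: input document origin minus original document origin."""
    return (input_origin[0] - original_origin[0],
            input_origin[1] - original_origin[1])


def correct_position(registered: Point, offset: Point) -> Point:
    """Step S212: add the offset to a registered processing position."""
    return (registered[0] + offset[0], registered[1] + offset[1])


original_origin = (50, 50)   # registered with the original document
input_origin = (110, 110)    # assumed value, inferred from the example

offset = positional_offset(input_origin, original_origin)
ocr_position = correct_position((100, 100), offset)
extraction_position = correct_position((200, 400), offset)

print(offset, ocr_position, extraction_position)
# prints: (60, 60) (160, 160) (260, 460)
```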

As described above, in the image processing
method and apparatus according to this embodiment, even
in batch processing of different types of documents,
the amount of positional offset caused between an
original document and an input document can be
calculated by extracting universal features unique to a
document and determining a document origin without
relying on markings or the like in setting a reference
position for document offset correction. This makes it
possible to correct the document positional offset.

[Second Embodiment]

In the first embodiment, a document origin is set
at an upper left position on a document. The present
invention is not limited to this. For example, a
document origin may be set at a lower right position or
at the barycentric average of objects.
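
The alternative origins mentioned above might be computed as in the
following sketch. The Block record and the function names are
hypothetical. In particular, the "barycentric average of objects" is
interpreted here as the unweighted mean of the block centers, which is
an assumption; other weightings (for example, by block area) would be
equally consistent with the text.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """Bounding box of one identified area (hypothetical record)."""
    x: int
    y: int
    width: int
    height: int


def upper_left_origin(blocks: List[Block]) -> Tuple[int, int]:
    # Leftmost and uppermost coordinates, as in the first embodiment.
    return (min(b.x for b in blocks), min(b.y for b in blocks))


def lower_right_origin(blocks: List[Block]) -> Tuple[int, int]:
    # Rightmost and lowermost coordinates of all blocks.
    return (max(b.x + b.width for b in blocks),
            max(b.y + b.height for b in blocks))


def barycentric_origin(blocks: List[Block]) -> Tuple[float, float]:
    # Assumed interpretation: unweighted mean of the block centers.
    n = len(blocks)
    cx = sum(b.x + b.width / 2 for b in blocks) / n
    cy = sum(b.y + b.height / 2 for b in blocks) / n
    return (cx, cy)
```

Whichever definition is chosen, the same one has to be used both when
the original document is registered and when the input document is
processed, since the offset is the difference between the two origins.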
[Third Embodiment] 

In the first embodiment, character recognition
and image extraction are used as the processes performed
on a document. However, the present invention is not
limited to this. Obviously, the processes include any
instructions associated with document processing, e.g.,
an image compression instruction, summarizing
instruction, translation instruction, read-aloud
instruction, and seal-impression collation instruction.

[Fourth Embodiment]

In this embodiment, an example of the step of
calculating a document origin (original document origin
and input document origin) in the first embodiment will
be described.

Fig. 4 is a flow chart showing the above
processing. This processing will be described below
with reference to this flow chart.

In step S400, as blocks for the formation of a
document origin from area identification information,
blocks having a table attribute, text attribute, title
attribute, and frame attribute are selected. As a
result, in the document image, block areas having the
respective attributes can be specified, as shown in
Fig. 5A.

In step S402, unstable blocks (text blocks
containing noise in this embodiment) are removed from
the block areas selected in step S400. In this case,
for example, character recognition is performed for
each of the text blocks selected in step S400, and only
blocks whose average recognition scores are equal to
or more than a predetermined value are left as text
blocks for the formation of a document origin. More
specifically, this operation is performed to remove a
noise area itself or a text block including a noise
area, because such a block degrades the document origin
formation precision. Fig. 5B shows the resultant
document image.

In step S404, the coordinates of the leftmost end
and uppermost end of the block areas finally left after
selection in steps S400 and S402 are obtained to
determine a document origin (Fig. 5C).

A document origin can be calculated by the above
method.

In step S404, the leftmost end and uppermost end
coordinates are obtained from the remaining block areas.
Alternatively, the rightmost end coordinates or
lowermost end coordinates may be obtained instead.
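
Steps S400 to S404 can be sketched as follows. The Block record, the
ocr_score field, the attribute strings, and the threshold of 0.8 are all
hypothetical; the text only requires that text blocks whose average
recognition score falls below "a predetermined value" be removed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Attributes used to form the document origin (step S400).
ORIGIN_ATTRIBUTES = {"table", "text", "title", "frame"}


@dataclass
class Block:
    """One area from area identification (hypothetical record)."""
    attribute: str
    x: int
    y: int
    width: int
    height: int
    ocr_score: Optional[float] = None  # average score, text blocks only


def document_origin(blocks: List[Block],
                    min_score: float = 0.8) -> Tuple[int, int]:
    # Step S400: keep only blocks with the selected attributes.
    selected = [b for b in blocks if b.attribute in ORIGIN_ATTRIBUTES]

    # Step S402: drop text blocks whose average recognition score is
    # missing or below the threshold (treated as noise).
    stable = [
        b for b in selected
        if b.attribute != "text"
        or (b.ocr_score is not None and b.ocr_score >= min_score)
    ]
    if not stable:
        raise ValueError("no stable block left to form a document origin")

    # Step S404: leftmost and uppermost coordinates of remaining blocks.
    return (min(b.x for b in stable), min(b.y for b in stable))
```

For example, a low-scoring text block near the page corner is ignored,
so that scanner noise does not pull the origin away from the stable
layout elements.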
[Fifth Embodiment] 

In the fourth embodiment, in step S400, areas
having text, title, frame, and table attributes as
block attributes are selected. The present invention
is not limited to this. For example, only areas having
table and frame attributes or text and title attributes
may be selected. That is, any combination of
attributes can be set, and any block attributes can be
set as long as they represent features of a document
(cells in a table and the like).
[Sixth Embodiment] 

In the fourth embodiment, in step S402, an
average score of character recognition is used as a
criterion for the removal of unstable areas. However,
the present invention is not limited to this. For
example, small character sizes or text area positions
may be used as criteria.
[Other Embodiment]

The present invention may be applied to a system
constituted by a plurality of devices (e.g., a host
computer, an interface device, a reader, a printer, and
the like) or an apparatus comprising a single device
(e.g., a copying machine, a facsimile apparatus, or the
like).

The object of the present invention is realized
even by supplying a storage medium storing software
program codes for realizing the functions of the
above-described embodiments to a system or apparatus,
and causing the computer (or a CPU or an MPU) of the
system or apparatus to read out and execute the program
codes stored in the storage medium. In this case, the
program codes read out from the storage medium realize
the functions of the above-described embodiments by
themselves, and the storage medium storing the program
codes constitutes the present invention. The functions
of the above-described embodiments are realized not only
when the readout program codes are executed by the
computer but also when the OS (Operating System) running
on the computer performs part or all of actual
processing on the basis of the instructions of the
program codes.

The functions of the above-described embodiments
are also realized when the program codes read out from
the storage medium are written in the memory of a
function expansion board inserted into the computer or a
function expansion unit connected to the computer, and
the CPU of the function expansion board or function
expansion unit performs part or all of actual processing
on the basis of the instructions of the program codes.

When the present invention is to be applied to the
above storage medium, program codes corresponding to the
flow charts (shown in Fig. 2 and/or Fig. 4) described
above are stored in the storage medium.

As has been described above, according to the
present invention, the positional offset of an image
can be corrected without performing any processing for
the setting of a reference position with respect to a
document image, e.g., the setting of markings. This
makes it possible to reduce the load imposed on the
user in performing the correction processing as
compared with the prior art.

As many apparently widely different embodiments of
the present invention can be made without departing from
the spirit and scope thereof, it is to be understood
that the invention is not limited to the specific
embodiments thereof except as defined in the appended
claims.